ABSTRACT
Various pharmacological, genetic and immunity-related factors contribute, directly or indirectly, to
recorded Adverse Drug Reactions (ADRs). These factors include, but are not limited to, ethnicity, age
and demographics, social and economic status, and gender. Such factors are considered secondary
causes, whereas the primary causes are drug-drug, drug-protein and protein-protein interactions.
With advanced machine learning and data science algorithms, prediction in pharmacovigilance has
improved considerably in recent years. Conventional methods are time-consuming and demand
substantial intervention from experts and manufacturers. Machine learning models have simplified
the analysis, and regression models have identified serious adverse drug reactions better than
conventional methods. This work considers allergies, organ failures and haemorrhages, focusing on
parameters such as age, number of medications consumed, number of illnesses affecting the
patient, dosage, type of medical institution, previous or genetic history of adverse reactions to
medicines, type of consumption and method of medication intake. The results indicate that
elderly patients with multiple illnesses tend to take multiple medicines and are therefore more
exposed to serious adverse drug reactions. For such patients, monitoring intervals should be
shortened and supervised accordingly. Medical institutions bear primary responsibility for
monitoring the previous history of adverse reactions and any new symptoms after a new medicine
is taken. The proposed approach studies the relationships between these factors and fits them in a
binary logistic model for effective detection and prediction. The outcomes of the proposed model
justify the need for additional parameters to achieve promising accuracy in the detection and
prediction of ADRs.
Keywords—Adverse drug reaction, machine learning, binary logistic regression, 
demographics, elderly. 


1. Introduction 

As the research and use of medications have expanded, adverse drug reactions (ADRs) have steadily
become a public concern. ADRs are harmful or unexpected responses, unrelated to the purpose of
treatment, to qualified medications used at normal doses. A serious adverse drug reaction (SADR) is
a heterogeneous reaction unrelated to typical pharmacological effects; it cannot be detected by
traditional toxicological screening, has a low incidence, and is delayed, dose-independent, and
unpredictable [1]. When ADRs occur, various organs throughout the body are affected, endangering
patients' lives and health and wasting substantial medical resources. From 1966 to 1996, an
average of 6.7 out of every 100 hospitalised patients in the United States had SADRs, with a death
rate of 10%. ADRs extend each hospitalised patient's average stay by two days and raise the
average cost by $2500 [2]. The number of ADR reports in China has risen dramatically in recent
years. In 2020, the China Adverse Drug Reaction Monitoring System received 1.676 million ADR
reports (1251 reports per million population), with severe ADRs accounting for 10% of them. SADRs
raise the expense of medical treatment for patients, may cause treatment delays, and have a
negative impact on patients' quality of life. Severe ADRs also cause patients to lose trust in
doctors, prompting both sides to become embroiled in medical disputes and exacerbating
the already strained doctor-patient relationship [3].

ADRs have emerged as one of the primary factors contributing to the unpredictability of clinical drug
research and development, with the potential to halt development owing to patient harm.
ADRs have a negative impact on patient health and the operation of medical institutions [4].
Pharmaceutical producers, drugstores, and medical institutions are among the users of the reporting
system, and over 16.87 million ADR/event reports were expected to be gathered by 2020. ADR
reporting and monitoring have grown significantly, and the number of reports and reporting rates are
increasing, providing data for this study. Accordingly, this study examined the factors that
contribute to the incidence of SADRs, identified the factors that influence the prognosis of patients
with severe adverse reactions at various levels of medical facilities, and offered suggestions for
monitoring. Machine learning approaches can be used to interpret various data types in order to
anticipate ADRs [5-6]. These methods make use of a variety of input data, including chemical
structures, gene expression, and text mining. These data types are then analysed algorithmically to
produce prediction models using, for example, a random forest (machine learning) or an artificial
neural network (deep learning). Deep learning, a subset of machine learning in artificial intelligence
(AI), has emerged as a promising and highly effective method for combining and interrogating
multiple biological data types in order to develop novel hypotheses [7]. Deep learning is widely
employed in drug development and repurposing; nevertheless, its applications in ADR prediction
using gene expression data are limited.

Open TG-GATEs is a large-scale toxicogenomics database that collects gene expression profiles
from in vivo and in vitro drug-treated samples. These expression profiles are the result of the Japanese
Toxicogenomics Project, which aimed to create a large database of drug toxicities to aid in drug
discovery. It also collects physiological, biochemical, and pathological measurements of the treated
animals [8]. Similar databases aimed at profiling substance toxicity have also been created. Unlike
other databases such as LINCS, which have been used to predict many ADRs in a single study, Open
TG-GATEs has been used to analyse individual, specific toxicities. To the best of our knowledge, no
attempt has been made to develop from it a general framework for anticipating multiple ADRs. The
Open TG-GATEs database design has various advantages over the LINCS database, most notably the
inclusion of in vivo samples with varying dosages and durations of administration. We therefore
designed our analysis to include numerous samples of each compound with varying dosages and
durations, which necessitated additional noise-removal steps in the data processing [9]. This paper
demonstrates how we developed systematic, deep learning-based ADR prediction models. The
method integrates ADR incidence data, including frequency details, from the FAERS (FDA Adverse
Event Reporting System) database with gene expression profiles from Open TG-GATEs. We
demonstrate how feature selection and hyperparameter optimization strategies can be used to increase
model performance [10]. The approaches and models proposed in our work are useful tools for
predicting the likelihood of ADRs in the field.

The FDA Adverse Event Reporting System (FAERS) is "a database that gathers adverse event
reports, medication error reports, and product quality concerns that resulted in adverse events that
were submitted to FDA" (https://open.fda.gov/data/faers/) [15]. However, because the terms used
in the FAERS database are left to the reporter's discretion, imprecise descriptions are frequently
included, such as broad, ambiguous terms for adverse events or treatments. FAERS database entries
covering 11 years (2004-2015) have been curated and standardised using Medical Dictionary for
Regulatory Activities (MedDRA) preferred terms (PT). From the total number of reports (4.8 million),
we retrieved all of the compound-ADR combinations (70,553,900) [11]. The presence of reports
involving multiple medicines is one of the challenges of using the FAERS database in ADR
prediction algorithms.


2. Background Study 

The in vivo gene expression profiles of rat liver samples were retrieved from the Open TG-GATEs
database. We chose the rat in vivo data for our study primarily because it contained more
compounds and a higher number of time points than the in vitro data (rat and human). However, our
technique is easily adaptable to other datasets. This dataset included both single-dose and
repeated-dose studies [12]. The administration-to-sacrifice times in single-dose studies were 3, 6, 9,
or 24 hours, whereas in repeated-dose studies the compounds were administered to rats once daily
for 4, 8, 15, or 29 days. All animals in the repeated-dose studies were euthanized 24 hours after the
final administration. Gene expression profiles in Open TG-GATEs were measured using microarray
technology (Affymetrix GeneChip).

The Affymetrix CEL data were obtained from the website http://toxico.nibiohn.go.jp, and they were
pre-processed with the affy package [13]. The MAS 5.0 algorithm (mas5) was used with the default
settings provided by affy, with normalisation set to TRUE. The resulting normalised dataset is
hereafter referred to as "the raw data", and all subsequent analyses relied on it. Next, for each probe
set, fold-change values were computed by dividing the raw data by the mean intensities of the
corresponding control samples; these values were then log2-transformed, giving the "log2FC
dataset."

The report data were cleaned and pre-processed to ensure that they were clean and complete
[14]. A total of 571 326 initial records were collected. Each ADR case is recorded under a code;
however, the same code sometimes contains multiple entries for different medicines. To remove
duplicates, we used Excel to ensure that each code kept one record, and 394 037 ADR records were
retained [15]. The report year, age, gender, proportion of serious adverse reactions, and adverse
reaction outcomes were subjected to descriptive analysis and the chi-square test. Logistic regression
was used to investigate the factors influencing the prognosis of SADRs at various levels of medical
facilities. SPSS 24.0 was used for all data analysis.
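The fold-change step described above can be sketched as follows. This is a minimal illustration in plain Python, not the authors' R pipeline; the function name `log2_fold_change`, the probe-set names and the intensity values are hypothetical.

```python
import math
from statistics import mean

def log2_fold_change(treated, controls):
    """Return log2(treated / mean(controls)) for each probe set.

    treated:  dict mapping probe-set ID -> intensity in one treated sample
    controls: list of dicts with the same keys, one per matched control sample
    """
    log2fc = {}
    for probe, intensity in treated.items():
        control_mean = mean(sample[probe] for sample in controls)
        log2fc[probe] = math.log2(intensity / control_mean)
    return log2fc

# Hypothetical normalised intensities for two probe sets (illustrative only).
treated_sample = {"probe_A": 400.0, "probe_B": 50.0}
control_samples = [
    {"probe_A": 90.0, "probe_B": 210.0},
    {"probe_A": 110.0, "probe_B": 190.0},
]
result = log2_fold_change(treated_sample, control_samples)
```

A positive value indicates higher expression in the treated sample than in the controls, and a negative value indicates lower expression.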

Because the experimental design comprised a variety of doses and exposure durations, the
medications had varying impacts on the gene expression profiles. To decrease noise, we predicted
whether each sample was treated or not treated using a generalised linear model with Lasso
regularisation (GLMNET package in R) [16]. The entire raw dataset was utilised as the binary
classification training set (treated and control). We supplied all microarray data from the same time
point to a single model, with one model for each exposure duration. Following that, we calculated
each sample's probability of being classified as treated. Only those samples with probabilities
greater than 92% were retained [17]. From the total number of reports (4.8 million), we retrieved all
of the compound-ADR combinations (70,553,900). The existence of reports with numerous
medications used (polypharmacy), which is expected in patients with chronic conditions, is one of the
challenges of using the FAERS database in ADR prediction models. In such instances, untrustworthy
associations add noise to the data. To address this problem, we only used associations in which the
compound was identified as the primary suspect (PS) (15,377,900). We counted the number of
reports for each compound-adverse drug event combination, as well as the overall number of reports
for both the compound in question and the adverse event [18].
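The report counting described above can be arranged as a 2x2 table per compound-event pair, which is the usual input for a significance test. The sketch below is an assumption about how such counts could be organised; the function name, the tuple layout and the toy reports are hypothetical, and no particular test is implied by the source at this point.

```python
from collections import Counter

def contingency_counts(reports, compound, event):
    """Build the 2x2 report counts for one compound-event pair.

    reports: iterable of (compound, adverse_event) tuples, one per
             primary-suspect association.
    Returns (a, b, c, d):
      a = reports with both the compound and the event
      b = reports with the compound but a different event
      c = reports with the event but a different compound
      d = all remaining reports
    """
    reports = list(reports)
    pair_counts = Counter(reports)
    compound_counts = Counter(drug for drug, _ in reports)
    event_counts = Counter(reaction for _, reaction in reports)
    a = pair_counts[(compound, event)]
    b = compound_counts[compound] - a
    c = event_counts[event] - a
    d = len(reports) - a - b - c
    return a, b, c, d

# Toy reports (hypothetical names, illustrative only).
toy_reports = [
    ("drug_X", "nausea"), ("drug_X", "nausea"), ("drug_X", "rash"),
    ("drug_Y", "nausea"), ("drug_Y", "headache"), ("drug_Z", "rash"),
]
table = contingency_counts(toy_reports, "drug_X", "nausea")
```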

The variables examined were derived entirely from the Adverse Drug Reaction Event Report Form.
The variables in this study were encoded and assigned in line with the Adverse Drug Reaction
Reporting and Monitoring Management Measures. According to the National Adverse Drug Reaction
Monitoring Center regulations, reported ADRs that resulted in death; teratogenic, carcinogenic, or
birth-defect effects; permanent sequelae; permanent damage to organ function; or hospitalisation or
prolonged hospital stay were classified as "Severe ADRs" (abbreviated as SADR), while other cases
were classified as "Normal ADRs." Concerning the difference between medical institutions [19], this
study follows the Ministry of Health's measures for the administration of hospital grades: institutions
that primarily provide basic public health and basic medical services are treated as primary hospitals,
and other comprehensive medical institutions as non-primary hospitals, in accordance with relevant
national policies. All primary institutions are considered first-level hospitals; a secondary hospital is
a regional hospital that provides comprehensive medical and health services to multiple communities
and undertakes certain teaching and scientific research tasks; and tertiary hospitals are typically local
and provincial hospitals.


3. Proposed Threshold based Combination Model

For each ADR, we first labelled the compounds with the most significant associations as positive
(p-value threshold 0.05) and the compounds with the least significant associations as negative. We
balanced the number of positive and negative compounds and retrieved the corresponding gene
expression profiles to build a prediction model [20]. The training and validation sets were then
created by enforcing two criteria: 1) the datasets were balanced, i.e., the number of positive and
negative samples in both sets was equal, and 2) no compounds were shared between training and
validation. The number of samples associated with each compound varied considerably, making
standard cross-validation impossible to use. To get around this constraint, we swapped compounds
between training and validation and sampled different training and validation set combinations [21].
The best balanced configurations with training:validation ratios close to 80:20 were subsequently
chosen.
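One way to picture the split search is a random search over compound assignments that penalises class imbalance and distance from an 80:20 ratio. The sketch below is an assumption about how such a search could look, not the authors' procedure; the scoring rule, the function name `sample_split` and the toy compound counts are hypothetical.

```python
import random

def sample_split(samples_per_compound, labels, n_trials=2000, seed=0):
    """Search for a compound-segregated split that is balanced and close to 80:20.

    samples_per_compound: dict compound -> number of expression samples
    labels:               dict compound -> 1 (positive) or 0 (negative)
    Returns (train_compounds, validation_compounds) for the best trial.
    """
    rng = random.Random(seed)
    compounds = sorted(samples_per_compound)
    total = sum(samples_per_compound.values())
    best, best_score = None, float("inf")
    for _ in range(n_trials):
        rng.shuffle(compounds)
        cut = rng.randint(1, len(compounds) - 1)  # both sets stay non-empty
        train, valid = compounds[:cut], compounds[cut:]
        score = 0.0
        for subset in (train, valid):
            pos = sum(samples_per_compound[c] for c in subset if labels[c] == 1)
            neg = sum(samples_per_compound[c] for c in subset if labels[c] == 0)
            score += abs(pos - neg) / total  # class-imbalance penalty
        n_train = sum(samples_per_compound[c] for c in train)
        score += abs(n_train / total - 0.8)  # distance from an 80:20 ratio
        if score < best_score:
            best, best_score = (sorted(train), sorted(valid)), score
    return best

# Toy data: sample counts differ between compounds (illustrative only).
counts = {"P1": 12, "P2": 8, "P3": 6, "P4": 10, "P5": 4,
          "N1": 12, "N2": 8, "N3": 6, "N4": 10, "N5": 4}
classes = {name: 1 if name.startswith("P") else 0 for name in counts}
train_compounds, valid_compounds = sample_split(counts, classes)
```

Because whole compounds are assigned to one side or the other, no compound can appear in both sets, which is the segregation criterion stated above.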

To avoid data leakage between the validation and training sets, feature selection was limited to the
training set. The validation data was therefore used only to select the highest performing models.
For feature selection, we used the Python implementation of Boruta with default settings.
Important features are the variables (genes in this case) that are required to classify the samples
as either positive or negative [22]. Using such important features for classification reduces the
dimensionality of the data. Moreover, these important features (genes) can give substantial insights
into the biological process under study. Boruta creates shadow variables by shuffling the values of
the original features; these shadow variables are added to reduce the influence of randomness and
improve the precision of feature selection.
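The shadow-variable idea can be illustrated with a simplified sketch. Boruta itself uses random forest importances and a statistical test over many iterations; the version below substitutes absolute Pearson correlation as a stand-in importance score and simply counts how often a real feature beats the best shadow. All names and data are hypothetical.

```python
import random
from statistics import mean

def abs_correlation(xs, ys):
    """Absolute Pearson correlation; a simple stand-in for feature importance."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return abs(cov) / (var_x * var_y) ** 0.5

def shadow_feature_hits(features, labels, n_rounds=20, seed=0):
    """Count how often each real feature beats the best shadow feature.

    features: dict feature name -> list of values (one per sample)
    labels:   list of 0/1 class labels
    """
    rng = random.Random(seed)
    hits = {name: 0 for name in features}
    for _ in range(n_rounds):
        shadow_scores = []
        for values in features.values():
            shadow = values[:]   # copy, then shuffle to break any link to the labels
            rng.shuffle(shadow)
            shadow_scores.append(abs_correlation(shadow, labels))
        threshold = max(shadow_scores)
        for name, values in features.items():
            if abs_correlation(values, labels) > threshold:
                hits[name] += 1
    return hits

# Toy data: one gene tracks the label, the other does not (illustrative only).
toy_labels = [0] * 10 + [1] * 10
toy_features = {
    "gene_informative": [0.0] * 10 + [1.0] * 10,
    "gene_noise": [0.0, 1.0] * 10,
}
hits = shadow_feature_hits(toy_features, toy_labels)
```

Features that rarely or never beat the shuffled copies are candidates for rejection, which is the intuition behind comparing real features with their shadows.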

The outcomes of the research were ADR results as well as the impact on the original disease. In the
binary logistic regression, the model defined a favourable prognosis as patients who recovered from
ADRs and showed no significant influence on the pre-existing illness, and a poor prognosis as patients
who did not improve or had a deterioration of the original condition. Each deep learning model is
made up of three types of layers: input, hidden, and output. Optuna was used to tune
hyperparameters. Optuna optimises by trial and error, sampling values for model hyperparameters
from a set of values or options provided by the user for a certain number of trials [24]. The outcomes
of all trials can then be analysed to find the best parameters. The tuned hyperparameters were as
follows. "DNNdepth": the number of densely connected hidden layers; the potential values were (1,
2, 5, 10, and 30). "width": the number of nodes per layer; the potential values were (100, 250, 500,
and 700). We employed two approaches to decrease the possibility of overfitting. The first, "drop",
removes certain nodes before proceeding to the next layer; it took one of the values (0.2, 0.3, 0.4, or
0.5), where 0.2 signifies that 20% of the nodes are discarded. The second is noise introduction: the
amount of added Gaussian noise (0.2, 0.3, 0.4, or 0.5). Other hyperparameters included "activation",
the output layer's activation function, with potential values ("sigmoid", "linear"), and "learning rate"
for the Adam optimizer, chosen from (0.001, 0.0005, and 0.00001). We picked the models with the
highest validation set accuracies. The maximum number of epochs was 800; however, training was
ended by "early stopping" if the accuracy did not increase for 75 epochs.
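The search space and the early-stopping rule above can be written down compactly. The sketch below uses plain random choice as a stand-in for Optuna's sampler and a hypothetical `accuracy_per_epoch` callable in place of real model training; the dictionary keys mirror the names in the text, and everything else is an illustrative assumption.

```python
import random

# Candidate values as listed in the text.
SEARCH_SPACE = {
    "DNNdepth": [1, 2, 5, 10, 30],
    "width": [100, 250, 500, 700],
    "drop": [0.2, 0.3, 0.4, 0.5],
    "noise": [0.2, 0.3, 0.4, 0.5],
    "activation": ["sigmoid", "linear"],
    "learning_rate": [0.001, 0.0005, 0.00001],
}

def sample_trial(rng):
    """Draw one configuration (plain random choice, not Optuna's own sampler)."""
    return {name: rng.choice(values) for name, values in SEARCH_SPACE.items()}

def train_with_early_stopping(accuracy_per_epoch, max_epochs=800, patience=75):
    """Return (best_accuracy, epochs_run).

    accuracy_per_epoch: callable epoch -> validation accuracy. This is a
    hypothetical placeholder; a real model would train for one epoch and
    then be evaluated on the validation set.
    """
    best, since_best = float("-inf"), 0
    for epoch in range(1, max_epochs + 1):
        acc = accuracy_per_epoch(epoch)
        if acc > best:
            best, since_best = acc, 0
        else:
            since_best += 1
            if since_best >= patience:
                return best, epoch
    return best, max_epochs

config = sample_trial(random.Random(0))

# Toy accuracy curve that improves until epoch 100 and then plateaus.
best_acc, epochs_run = train_with_early_stopping(lambda e: min(e, 100) / 125)
```

With the toy curve, training stops 75 epochs after the last improvement rather than running all 800 epochs.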

The model did not use leave-p-out or k-fold cross-validation protocols on the gene expression
samples because: 1) the training and validation sets must be assigned based on compound
segregation, i.e., the same compound should not span training and validation sets; and 2) the
number of samples varied from compound to compound, making it impossible to create balanced
folds. Instead, we generated three distinct training and validation sets and selected features
separately for each of them. Overfitting is a major problem in machine learning, especially when the
amount of data is small. We sought to reduce overfitting by combining several techniques, one of
which is early stopping, which ends training when the model starts to fit the training data too
specifically.

The report year, age, gender, proportion of serious adverse reactions, and adverse reaction
outcomes were subjected to descriptive analysis and the chi-square test. Logistic regression was
used to investigate the factors influencing the prognosis of SADRs at various levels of medical
facilities. SPSS 24.0 (IBM Corp., Armonk, NY) was used for all data analysis. A p-value of 0.05 or
less was considered statistically significant.
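For reference, a binary logistic model of the kind referred to above has the standard form below. The symbols are ours rather than the source's: Y = 1 is assumed to denote a poor prognosis, and x_1, ..., x_k stand for the study variables (for example age, number of medications, or hospital level).

```latex
\[
\ln\frac{p}{1-p} = \beta_0 + \beta_1 x_1 + \dots + \beta_k x_k,
\qquad
p = P(Y = 1 \mid x_1, \dots, x_k)
  = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x_1 + \dots + \beta_k x_k)}}
\]
```

Under this form, exp(beta_j) is the odds ratio associated with a one-unit increase in x_j, holding the other variables fixed.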

According to the National Adverse Drug Reaction Monitoring Center regulations, reported ADRs
that resulted in death; teratogenic, carcinogenic, or birth-defect effects; permanent sequelae;
permanent damage to organ function; or hospitalisation or prolonged hospital stay were classified as
"Serious ADRs" (abbreviated as SADR), while other cases were classified as "Normal ADRs."
Following the Ministry of Health's measures for the management of hospital grades, this research
treats facilities that primarily provide basic public health and basic medical services as primary
hospitals, and other comprehensive medical institutions as non-primary hospitals, in compliance with
applicable national laws.


4. Results and Discussions 
Among the 394 037 ADR reports, Figs. 1 and 2 reveal that 52.3% of the patients (206 042) are women,
whereas the remainder are men (187 473). The gender difference is not substantial. Approximately
93.6% of patients are of Han ethnicity, and 36.3% are above the age of 60. Two or more ADRs
occurred in 1.46% of the patients (5673). Regarding the timing of ADRs during treatment, 60.5%
(238 545) occurred on the day of medication. Approximately 94.7% of ADRs occurred within one
week of treatment, with only 0.9% occurring beyond one month.

Model performance was assessed using predictions on the validation set. We calculated the
validation set's accuracy and area under the curve (AUC). For enrichment analysis and gene
annotation, the TargetMine data analysis platform was employed. KEGG, Reactome, and NCI
databases were used for enrichment analysis. TargetMine generated p-values using the one-tailed
Fisher's exact test. Benjamini-Hochberg multiple-test correction was applied, with a p-value
significance level of 0.05. We screened out low-quality or unsuitable samples to limit the data
dispersion caused by multiple dosage levels and administration periods (sacrifice periods). To do so,
we used Lasso to classify the samples into treatment and control groups. A total of 6,619 of the
10,573 treated samples were classified as controls and removed, with the majority of them falling
into the "Low" dosage level group. The samples that were correctly classified as treated (3,953
samples) were kept for further analysis.
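The Benjamini-Hochberg correction mentioned above for the enrichment p-values can be sketched in a few lines. TargetMine applies the correction internally; the function below is an independent illustration of the standard step-up procedure, and the p-values are made up.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans: True where the hypothesis is rejected at FDR alpha."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k (1-based) with p_(k) <= (k / m) * alpha.
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_rank = rank
    # Reject every hypothesis ranked at or below that k.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            rejected[idx] = True
    return rejected

# Hypothetical enrichment p-values for five pathways (illustrative only).
pathway_p_values = [0.001, 0.008, 0.039, 0.041, 0.20]
significant = benjamini_hochberg(pathway_p_values)
```

In this toy case the third and fourth p-values are below 0.05 but do not survive the correction, which is the purpose of controlling the false discovery rate.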
To predict the likelihood of ADRs, we used a novel approach that combined toxicogenomics
gene expression profiles taken from Open TG-GATEs with ADR reports extracted from FAERS.
Combining these two very different data sources enabled us to estimate ADRs accurately.
Furthermore, it resulted in the creation of a unique dataset that links drug-induced gene expression
patterns to ADRs. To overcome the considerable obstacles in merging the two datasets, we first
extracted the individual drug-induced gene expression signatures from Open TG-GATEs. We then
retrieved the ADR incidence frequencies for these medicines and calculated their statistical
significance.

Table 1. Medications and their side effects

Effects of Medication    Mild Reaction    Severe Reactions
Expected Recovery        351 223          24698
No improvement           5638             7850
ADR                      1296             426
Fatality                 (0)              39
No Effects               195              418

Furthermore, the drug-induced gene expression patterns were very noisy due to different dosage
levels and sacrifice times, as well as the mix of repeated and single treatment studies.
To filter out the noise, we used Lasso to create a basic model that classified all of the samples as
either control or treatment. We used rigorous statistical analysis to narrow down the samples suitable
for further analysis. Deep learning has recently gained popularity in the realm of drug development.
In this work, we employed deep learning in conjunction with feature selection to reduce data
dimensionality and avoid overfitting owing to small sample sizes. Previously, several cell lines were
used to create prediction models for a variety of ADRs. Another study found that blood
transcriptomics might be used to investigate different organs. According to Figure 1, patients aged
36-80 years are highly prone to Adverse Drug Reactions.
[Figure: number of ADRs from 2015 to 2020 by age group (up to 6, 7 to 18, 19-35, 36-60, 61-80, and above 80 years)]
Fig 1. Number of ADR Registered cases from 2015-2020
This hypothesis has been reinforced by our demonstration of robust prediction models with excellent
accuracy using liver samples. Because the liver is a critical organ for drug processing and receives a
substantial volume of blood, it is commonly used in drug toxicity studies. Furthermore, even in
the absence of severe reactions to chemical toxicity, cell gene expression patterns differ. In contrast
to another study that used data from the LINCS database, which is a collection of in vitro gene
expression profiles from human cell lines, this study used in vivo gene expression data. Our method
is easily transferable to other publicly available toxicogenomics data collections, such as
DrugMatrix. Another distinction is that study's combination of chemical structure and Gene Ontology
(GO) data. Recently published studies have also examined certain ADRs or systems investigated in
this study.


[Figure: loss versus epoch (0 to 200) for the training and validation sets]
Fig 2. Epoch and Loss of Training and Testing Data

A noteworthy study used data from many sources to predict drug-induced liver impairment, and
its results were comparable to ours. Its models had an AUC of roughly 0.86. That technique,
however, is limited to drug-induced liver impairment. It also used chemical structures and
protein-related data, among many other forms of data. Another study, with an AUC of 0.97,
predicted gastric ulcers using gene expression data from the LINCS database. It also compared
the use of gene expression alone with the addition of further information to the same model. Because
hyperparameter optimization with Optuna made building our prediction models computationally
costly, only a limited number of models were generated. Figure 2 shows the loss over the training
and test epochs considered for the study.


CONCLUSION

Using the publicly accessible Open TG-GATEs and FAERS datasets, we created 14 deep learning
models to predict adverse drug events. These models may be used to determine whether a new
drug candidate is capable of causing these adverse effects. Furthermore, models for other ADRs may
be developed using the same feature selection, model development, and tuning methods. One of the
study's drawbacks is the relatively limited number of samples, as well as the varying doses and
durations of drug exposure. These constraints are reflected in the fluctuating training curves. As a
result, while a few models have lower correlations with ADR prediction, the majority have
correlation values larger than 50%, as indicated by their Matthews correlation coefficient plots.
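For readers unfamiliar with the Matthews correlation coefficient (MCC) cited above, a minimal sketch of the standard binary formula follows. The labels and predictions are made up for illustration and are not results from this study.

```python
import math

def matthews_corrcoef(y_true, y_pred):
    """Matthews correlation coefficient for binary labels (0/1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when any marginal count is zero.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom

# Hypothetical validation labels and predictions (illustrative only).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
mcc = matthews_corrcoef(y_true, y_pred)
```

The MCC ranges from -1 to 1, with 0 corresponding to chance-level prediction, so a value of 0.5 corresponds to the 50% level mentioned above.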